Reproducible and Accurate Matrix Multiplication
Abstract
Because floating-point operations are non-associative and parallel architectures schedule work dynamically, obtaining a bit-wise reproducible floating-point result across multiple executions of the same code on different, or even similar, parallel architectures is challenging. In this paper, we address the problem of reproducibility in the context of matrix multiplication and propose an algorithm that yields both reproducible and accurate results. The algorithm is composed of two main stages: a filtering stage that uses fast vectorized floating-point expansions in conjunction with error-free transformations, and an accumulation stage based on Kulisch long accumulators in a high-radix carry-save representation. Finally, we provide implementations and performance results in parallel environments such as GPUs.
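The effect of non-associativity, and the two ingredients the abstract names, can be illustrated with a minimal sketch. This is not the paper's implementation: `two_sum` is Knuth's classical error-free transformation for addition, and `exact_sum` uses Python's `fractions.Fraction` as a stand-in (an assumption made for brevity) for a long accumulator that holds the sum exactly and rounds only once at the end. All function names here are hypothetical.

```python
from fractions import Fraction


def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def naive_sum(values):
    """Left-to-right floating-point sum; the result depends on the order."""
    total = 0.0
    for v in values:
        total += v
    return total


def exact_sum(values):
    """Accumulate exactly (stand-in for a long accumulator), round once at the end."""
    acc = Fraction(0)
    for v in values:
        acc += Fraction(v)  # every finite float converts to a Fraction exactly
    return float(acc)


# The same four numbers in two different orders.
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1e16, -1e16, 1.0, 1.0]

print(naive_sum(order_a), naive_sum(order_b))  # the two orders disagree
print(exact_sum(order_a), exact_sum(order_b))  # the two orders agree
print(two_sum(1e16, 1.0))                      # rounded sum plus its rounding error
```

In the naive sum, the first `1.0` is absorbed by `1e16` in one ordering but not in the other, which is the kind of order dependence that dynamic scheduling exposes. The exact accumulator removes the dependence because no intermediate rounding occurs.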
Similar resources
A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure, so a method that runs on all structures, especially the Fibonacci Hypercube structure, is necessary for parallel matr...
Algebraic adjoint of the polynomials-polynomial matrix multiplication
This paper deals with a result concerning the algebraic dual of the linear mapping defined by the multiplication of polynomial vectors by a given polynomial matrix over a commutative field.
GEMMbench: a framework for reproducible and collaborative benchmarking of matrix multiplication
The generic matrix-matrix multiplication (GEMM) is arguably the most popular computational kernel of the 20th century. Yet, surprisingly, no common methodology for evaluating GEMM performance has been established over the many decades of using GEMM for comparing architectures, compilers and ninja-class programmers. We introduce GEMMbench, a framework and methodology for evaluating performance o...
SHORT-SS4: Error-Free Transformation of Matrix Multiplication by A Posteriori Verification
This paper is concerned with accurate computations for matrix multiplication. The authors develop an error-free transformation of matrix multiplication. It transforms a product of two floating-point matrices into a sum of several floating-point matrices by using only floating-point arithmetic. This transformation is useful not only for accurate matrix multiplication but also for interval e...
Acceleration of a Preconditioning Method for Ill-Conditioned Dense Linear Systems by Use of a BLAS-based Method
We are interested in accurate numerical solutions of ill-conditioned linear systems using floating-point arithmetic. Recently, we proposed a preconditioning method to reduce the condition numbers of coefficient matrices. The method utilizes an LU factorization obtained in working precision arithmetic and requires matrix multiplication in quadruple precision arithmetic. In this note, we aim to a...
Publication date: 2014